Fix flaky sanity test for TestAgentStatus #441

Merged
merged 2 commits into from
Dec 3, 2024
@musa-asad musa-asad commented Dec 3, 2024

Description of the issue

We have a flaky sanity check in our integration tests when running verifyUnixCtlScript.sh: https://github.com/aws/amazon-cloudwatch-agent/actions/runs/10724861249/job/29741856151#step:7:1248. Although this doesn't cause test failures, it can be a red herring when reading logs.

2024-09-05T17:17:50.0818601Z null_resource.integration_test_run (remote-exec): ****** processing amazon-cloudwatch-agent ******
2024-09-05T17:17:50.0820072Z null_resource.integration_test_run (remote-exec): all amazon-cloudwatch-agent configurations have been removed
2024-09-05T17:17:50.0821670Z null_resource.integration_test_run (remote-exec): In step 7, cwa_running_status is NOT expected. (actual=stopped; expected=running)
2024-09-05T17:17:50.0823223Z null_resource.integration_test_run (remote-exec):     sanity_unix.go:17: Running sanity check failed
2024-09-05T17:17:50.0824572Z null_resource.integration_test_run (remote-exec): --- FAIL: TestAgentStatus (98.56s)
2024-09-05T17:17:50.0825686Z null_resource.integration_test_run (remote-exec): FAIL
2024-09-05T17:17:50.0827097Z null_resource.integration_test_run (remote-exec): FAIL	github.com/aws/amazon-cloudwatch-agent-test/test/sanity	98.569s
2024-09-05T17:17:50.0828418Z null_resource.integration_test_run (remote-exec): FAIL

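This is caused by a race condition between the agent ctl commands finishing and the status properties being read: the check can run before the agent's reported status reflects the command that was just issued, as in the log above (actual=stopped; expected=running).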

Description of changes

  • Add a sleep function to prevent the race condition between the agent ctl commands finishing and the status properties.
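
The change itself is not shown in this conversation. As a rough sketch of the idea, assuming the script obtains the status by running some command, a bounded wait could look like the following. Note that this polls with a short sleep between attempts, which is a variation on the single fixed sleep described above. The helper name `wait_for_status`, its parameters, and the caller-supplied status command are all hypothetical and are not taken from verifyUnixCtlScript.sh.

```shell
#!/bin/sh
# Hypothetical helper, not the code merged in this PR.
# Polls a status command until it prints the expected value,
# sleeping between attempts so a just-issued ctl command has time to settle.
#
# $1 = expected status (e.g. "running")
# $2 = maximum number of attempts
# $3 = seconds to sleep between attempts
# remaining arguments = command that prints the current status (assumption)
wait_for_status() {
    expected="$1"
    attempts="$2"
    delay="$3"
    shift 3
    actual=""
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Run the caller-supplied status command and capture its output.
        actual="$("$@")"
        if [ "$actual" = "$expected" ]; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    # Report in the same spirit as the message seen in the log.
    echo "status is NOT expected. (actual=${actual}; expected=${expected})" >&2
    return 1
}
```

A call such as `wait_for_status running 10 1 some_status_command` would retry for roughly ten seconds before reporting a mismatch, where `some_status_command` stands in for however the script reads the agent's status. Compared with a single fixed sleep, polling returns as soon as the status settles, at the cost of a slightly longer helper.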

License

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Tests

Ran script on an EC2 instance with agent installed.

@musa-asad musa-asad requested review from movence and lisguo December 3, 2024 20:55
@musa-asad musa-asad self-assigned this Dec 3, 2024
@musa-asad musa-asad marked this pull request as ready for review December 3, 2024 21:26
@musa-asad musa-asad requested a review from a team as a code owner December 3, 2024 21:26
@musa-asad musa-asad merged commit b55ee08 into main Dec 3, 2024
4 checks passed